Analysis of Vibrio cholerae Genome Sequences Reveals Unique rtxA 
Variants in Environmental Strains and an rtxA-Null Mutation in 
Recent Altered El Tor Isolates 
ABSTRACT Vibrio cholerae genome sequences were analyzed for variation in the rtxA gene that encodes the multifunctional 
autoprocessing RTX (MARTX) toxin. To accommodate genomic analysis, a discrepancy in the annotated rtxA start site was resolved 
experimentally. The correct start site is an ATG downstream from rtxC resulting in a gene of 13,638 bp and deduced protein of 
4,545 amino acids. Among the El Tor O1 and closely related O139 and O37 genomes, rtxA was highly conserved, with nine alleles 
differing by only 1 to 6 nucleotides in 100 years. In contrast, 12 alleles from environment-associated isolates are highly variable, 
at 1 to 3% by nucleotide and 3 to 7% by amino acid. The difference in variation rates did not represent a bias for conservation of 
the El Tor rtxA compared to that of other strains but rather reflected the lack of gene variation in overall genomes. Three alleles 
were identified that would affect the function of the MARTX toxin. Two environmental isolates carry novel arrangements of ef- 
fector domains. These include a variant from RC385 that would suggest an adenylate cyclase toxin and from HE-09 that may 
have actin ADP-ribosylating activity. Within the recently emerged altered El Tor strains that have a classical ctxB gene, a muta- 
tion arose in rtxA that introduces a premature stop codon that disabled toxin function. This null mutant is the genetic back- 
ground for subsequent emergence of the ctxB7 allele resulting in the strain that spread into Haiti in 2010. Thus, similar to classi- 
cal strains, the altered El Tor pandemic strains eliminated rtxA after acquiring a classical ctxB. 

IMPORTANCE Pathogen evolution involves both gain and loss of factors that influence disease. In the environment, bacteria 
evolve rapidly, with nucleotide diversity arising by genetic modification. Such is occurring with Vibrio cholerae, exemplified by 
extensive diversity and unique variants of the rtxA-encoded multifunctional autoprocessing RTX (MARTX) toxin among 
environment-associated strains that cause localized diarrheal outbreaks and food-borne disease. In contrast, seventh pandemic 
El Tor V. cholerae strains associated with severe diarrhea have changed minimally until the altered El Tor emerged as the most 
frequent cause of cholera, including in the 2010 Haiti epidemic. These strains have increased virulence attributed to a new vari- 
ant of the major virulence factor, cholera toxin. It is revealed that these strains also have an inactivated MARTX toxin gene. A 
similar inactivation occurred during classical cholera pandemics, highlighting that evolution of El Tor cholera is following a 
similar path of increased dependence on cholera toxin, while eliminating other secreted factors. 



Received 28 December 2012 Accepted 20 March 2013 Published 16 April 2013 

Citation Dolores J, Satchell KJF. 2013. Analysis of Vibrio cholerae genome sequences reveals unique rtxA variants in environmental strains and an rtxA-null mutation in recent 
altered El Tor isolates. mBio 4(2):e00624-12. doi:10.1128/mBio.00624-12. 
Editor Claire Fraser, University of Maryland, School of Medicine 

Copyright © 2013 Dolores and Satchell. This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 
Unported license, which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Address correspondence to Karla J. F. Satchell, k-satchell@northwestern.edu. 



Vibrio cholerae is a diarrheal pathogen that is capable of spreading 
worldwide (1). The seventh cholera pandemic caused by 
O1 El Tor strains started in Indonesia, transitioned to India, and 
has been spreading worldwide since 1960 as a clonal expansion 
from a common ancestor (1, 2). In 1993, a recombination event 
occurred in which the typical El Tor O1 strain acquired a novel O 
antigen, resulting in the O139 epidemic (3). In 1994, "atypical" El 
Tor strains (also known as CTX-2 [2]) were identified that have 
modified CTXΦ prophage genes and gene arrangement and that carry 
a classical-like ctxB gene, converting the primary virulence factor 
cholera toxin to the classical toxin (4). Among these atypical El 
Tor strains are strains denoted as altered El Tor (5) (also known as 
CTX-3 [2]) that have a CTXΦ identical to that of the El Tor strains in 
sequence and genomic location, except that the normal El Tor ctxB3 
allele has two point mutations that convert the gene to the ctxB1 



allele found in the classical strains (4). Altered El Tor strains were 
first isolated in Bangladesh in the 1990s (2, 6) and subsequently 
overtook typical El Tor as the cause of cholera in Bangladesh since 
2000 (5). These strains then accumulated an additional mutation 
in the ctxB gene, allele ctxB7 (4, 7) (also known as CTX-3b [2]), 
and this strain has subsequently been identified in India, Bangladesh, 
Cameroon, Nepal, and Haiti (2, 7-11). Compared to typical 
pandemic El Tor, the altered El Tor variants are linked to in- 
creased production of cholera toxin (12) and more severely dehy- 
drating diarrheal disease in humans (6). 

V. cholerae also exists in the environment in water and is asso- 
ciated with aquatic animals. Some of these strains can also cause 
human infections, including localized diarrheal outbreaks or 
food-borne disease (13-17). In contrast to a calculated rate of 
mutation of only 3.3 single nucleotide polymorphisms (SNPs) per 
year for the El Tor O1 strains (2), the environmental isolates show 
extensive diversity, with >2% variation between individual iso- 
lates (8). 

The rtxA gene is the largest open reading frame (ORF) of the 
V. cholerae genome (18). All V. cholerae strains have an rtxA gene 
except for classical strains, which have a 7,869-bp deletion in the 
rtx locus that removes the 5' end of the rtxA gene as well as the rtx 
promoter region and genes for toxin maturation and secretion 
(19). 

The rtxA gene encodes a multifunctional autoprocessing RTX 
(MARTX) family toxin. Toxins in this family range in size from 
3,500 to 5,300 amino acids (aa) and are produced by various hu- 
man, animal, and insect pathogens (20). Similar to all RTX family 
proteins, the MARTX toxins are exported from the bacteria via 
C-terminal secretion signals and special dedicated type I secretion 
systems (21). The MARTX toxins encoded by different bacterial 
species are comprised of both conserved and variable domains. 
The conserved domains are characterized by long repeat regions 
called the A, B, and C repeats and by a cysteine protease domain 
(CPD) that is required for inositol hexakisphosphate-induced au- 
toprocessing of the toxins (20). The centrally located domains 
released to eukaryotic cell cytoplasm by autoprocessing are called 
"effector domains," and these carry cytotoxic and cytopathic ac- 
tivities that affect the cell biology of the target cells, including 
altered cytoskeleton assembly and signaling. A total of 10 distinct 
effector domains have been identified by sequence analysis, al- 
though a single MARTX toxin carries only from 1 to 5 of these 
effectors (20). V. cholerae has a MARTX toxin that delivers three 
effectors by autoprocessing: an actin crosslinking domain (ACD) 
that introduces an isopeptide bond between actin molecules, ren- 
dering the actin unable to assemble an actin cytoskeleton (22); a 
Rho-inactivation domain (RID) that induces actin depolymerization 
through loss of active RhoA (23); and an alpha-beta hydrolase 
enzyme that has not yet been characterized (20). 

Recently, we conducted an analysis of 40 biotype 1 Vibrio vul- 
nificus strains and identified four variants of the MARTX toxin, 
each of which carries a different repertoire of effector domains 
resulting in different cell biological activities and altered patho- 
genesis in mice (24). Based on that study, we were interested in 
whether V. cholerae MARTX toxins would also show diversity or if 
V. cholerae MARTX toxins would be more conserved, exclusively 
carrying the same three effector domains. 

To address the diversity of MARTX toxins in V. cholerae, publicly 
available complete and whole-genome sequences (WGS) 
and whole-genome SNP studies were analyzed for variation in the 
rtxA gene. We find that among both clinical and environmental 
V. cholerae, the overall domain structure of the MARTX toxin is 
highly conserved, but new variants and inactivating mutations can 
arise that affect function of the secreted toxin. 

RESULTS 

Bioinformatic analysis cannot resolve discrepancy in rtxA gene 
annotations. At the outset of this project, a problem arose iden- 
tifying rtxA genes in genomes due to conflicting annotation of the 
large gene. The original annotation of the representative genome 
V. cholerae N16961 rtxA gene designated an ATG start site that is 
23 bp downstream of the rtxC stop codon and preceded by a con- 
sensus AGAAGAG Shine-Dalgarno (SD) sequence (Fig. 1). Use of 
this start site generates a gene of 13,638 bp encoding a deduced 
protein of 4,545 aa (19). Subsequent annotation of the N16961 



complete genome defined a GTG start site within the coding re- 
gion of rtxC preceded by SD sequence GAAAGG (18). Use of this 
alternative start site generates a gene of 13,677 bp encoding a de- 
duced protein of 4,558 aa. 
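
The relationship between the two annotated gene lengths and their deduced protein lengths can be checked with a short calculation. The sketch below assumes that each quoted gene length includes the terminal stop codon; the function name `protein_length` is illustrative and does not come from any annotation pipeline.

```python
def protein_length(gene_length_bp: int) -> int:
    """Number of residues encoded by an ORF whose length includes the stop codon."""
    if gene_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    # One codon per 3 bp, minus the terminal stop codon, which encodes no residue.
    return gene_length_bp // 3 - 1


ATG_ORF_BP = 13_638  # gene length using the downstream ATG start site
GTG_ORF_BP = 13_677  # gene length using the upstream GTG start site

print(protein_length(ATG_ORF_BP))        # 4545
print(protein_length(GTG_ORF_BP))        # 4558
print((GTG_ORF_BP - ATG_ORF_BP) // 3)    # 13 additional codons from the GTG
```

Under this assumption, the two annotations differ by 39 nt, or 13 in-frame codons, so the ATG lies in the same reading frame as the upstream GTG.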

A BLASTP search using the 4,558-aa translation product as a 
query against the National Center for Biotechnology Information 
(NCBI) protein database reveals 14 full-length RtxA proteins annotated 
as part of sequencing projects since the original publication 
of the N16961 genome. Seven proteins were annotated with the GTG 
start site encoding a protein of 4,558 aa (MO10, M66-2, V52, 
MZO-3, MZO-2, 623-39, and AM-19226), while the other seven 
proteins used the ATG encoding a protein of 4,545 aa (IEC224, 
NCTC 8457, 2740-80, LMA3984-4, HE-48, HE-39, and HE-25). 
The annotation of the rtxA gene of 2010EL-1786 and all closely 
related strains sequenced to investigate the Haiti cholera epidemic 
used the ATG start site, but the deduced protein product was only 
4,534 aa, indicating a truncated protein that will be discussed be- 
low. 

It was supposed that the correct start site could be resolved by 
nucleotide (nt) analysis of additional genomes to determine 
whether only one start site was conserved. A 104-nt query repre- 
senting the rtxC-rtxA intergenic region was used in a BLASTN 
search of the NCBI genome sequences database. Two allele variants 
were identified. Allele 1 was 100% identical to the N16961 
query and was present in 88 genomes. Allele 2 was present in 3 
genomes, including MZO-2, MZO-3, and V. cholerae bv. albensis 
VL426. This second allele has one SNP that introduces a silent 
change of a GTG codon in rtxC to GTA but does not affect either 
possible rtxA start sites or SD sequences (Fig. 1A). Since both start 
sites and SD elements are conserved in all genomes, bioinformat- 
ics could not resolve the proper start site. 

Experimental identification of the ATG start site for rtxA. To 
address the issue of the appropriate start site experimentally, point 
mutations in the DNA sequence on the V. cholerae N16961 chro- 
mosome I were generated and tested for MARTX toxin-induced 
cell rounding and actin cross-linking activity after addition of bacteria 
to HeLa cells (Fig. 1B and C). Mutation 1 (mut1) altered the 
putative GTG start site to GTT, a mutation that also altered a 
codon in rtxC from Trp to Leu. Mutation 2 (mut2) inserted a C in 
codon 7 of rtxA-4558, resulting in a frameshift creating a premature 
stop before the SD sequence of the downstream ATG start 
site. Neither mutation to disrupt the longer rtxA-4558 ORF affected 
cell rounding or actin cross-linking, although actin cross-linking 
may have been slightly reduced (Fig. 1B and C). Thus, the 
GTG is not the primary start site of translation, as loss of this site 
did not significantly affect toxin activity. 

Mutation 3 (mut3) altered the ATG start codon to ATC, a 
mutation that alters the rtxA-4558 translation product to have an 
Ile in place of Met. Disruption of the ATG start codon eliminated 
cell rounding and reduced actin cross-linking such that only actin 
dimers were generated (Fig. 1B and C). To determine if the residual 
cross-linking activity leading to dimer formation represents 
translation starting at the GTG or at unmapped downstream start 
sites, mutation 4 (mut4) was created to convert a Glu codon to a 
stop in both rtxA-4545 and rtxA-4558. This mutant showed no cell 
rounding and no actin cross-linking (Fig. 1B and C). Overall, 
these data show that the ATG is the primary translation start site 
for rtxA, although minor activity may originate from the upstream 
GTG start site, resulting in actin dimers. 
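
The amino acid consequences of the start-site mutations can be confirmed against the standard genetic code. The sketch below is only an illustration: `describe_change` is a hypothetical helper, and the rtxC codon for mut1 is inferred rather than read from the sequence (Trp has a single codon, TGG, and the only single G-to-T substitution in it that yields Leu gives TTG).

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def describe_change(wild_type: str, mutant: str) -> str:
    """Summarize the amino acid change caused by a codon substitution."""
    before, after = CODON_TABLE[wild_type], CODON_TABLE[mutant]
    kind = "synonymous" if before == after else "nonsynonymous"
    return f"{wild_type}->{mutant}: {before}->{after} ({kind})"


# mut3, read in the longer rtxA-4558 frame: Met to Ile.
print(describe_change("ATG", "ATC"))
# mut1, read in the overlapping rtxC frame (inferred codon): Trp to Leu.
print(describe_change("TGG", "TTG"))
```

The table encodes internal codons only; a GTG used as an initiator is translated as Met, which is why mut1 removes a potential start site even though GTG and GTT both encode Val internally.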
FIG 1 Determining the accurate start site of the rtxA gene. (A) The gene sequence for N16961 rtxC-rtxA intergenic region is shown. The start site and SD 
sequence (white boxes) and the alternate start site and SD sequence (gray boxes) are indicated. Partial translation products of rtxC and rtxA from both potential 
start sites are shown below. Arrowheads indicate locations of site-directed mutations (above sequence) and the changes introduced in the translated products 
(below sequences). (B and C) HeLa cells were treated with log-phase V. cholerae for 90 min and then were photographed (B) and collected to prepare lysates for 
immunoblot using anti-actin and anti-tubulin antibodies (C). Strains used were KFV119 (N16961 ΔhapA ΔhlyA), KFV92 (N16961 ΔhapA ΔhlyA ΔrtxA), and 
mutants as detailed for panel A. Actin laddering in the Western blot is marked as actin monomer (Mo), dimer (Di), trimer (Tr), and tetramer (Te). 



Identification of full-length rtxA genes in whole-genome sequences. 
The correct-length 13,638-nt sequence of N16961 rtxA 
(nt 1550147 to 1563784; GenBank AE003852.1) was used in a 
BLASTN search of 95 complete and WGS genomes available in the 
NCBI genomes database. A total of 77 genomes were identified 
that contained a gene sequence that aligned against the full-length 
rtxA gene, while other genomes contained only rtxA gene frag- 
ments. Eight of these contained out-of-frame mutations and were 
excluded from analysis. All 69 complete rtxA genes were included 
(see Table S1 in the supplemental material). 

Pre-seventh pandemic El Tor and closely related V. cholerae 
strains are linked by an SNP at nt 1558877. Assessment of rtxA 
alignments from V. cholerae El Tor strains revealed strong conser- 
vation among rtxA genes from 57 genomes. These genes could be 
sorted into seven distinct alleles, varying from representative 
strain N16961 by only 1 to 4 bp (Table 1). To supplement the 
analysis, data from three recent genome SNP reports (2, 8, 10) 
were examined, and synapomorphic SNPs in the rtxA gene were 



identified. This analysis revealed two additional alleles confirmed 
by presence in at least three closely related isolates, raising the total 
El Tor alleles to nine (Table 1). 

Allele 6, represented by pre-seventh pandemic strain NCTC 
8457 isolated in 1910 in Saudi Arabia, differs from the represen- 
tative strain N16961 by only a G at rtxA nt 8731 (nt 1558877; 
GenBank AE003852.1). This allele is conserved in a 1986 isolate 
from Australia (BX 330286). The G8731 residue is also present 
with a second SNP in allele 7, represented by pre-seventh pan- 
demic strains M66-2 and MAK 757, and in allele 8, represented by 
1980 United States Gulf environmental isolate 2740-80. A closely 
related allele 9 from the O37 strain V52 also shares G8731 along 
with only 3 other SNPs (Table 1). These data indicate that G8731 
is a marker of rtxA genes from pre-seventh pandemic strains and 
El Tor strains that evolved independently of the seventh pan- 
demic. 

The El Tor rtxA allele 1 is highly conserved throughout the 
seventh pandemic isolates. The predominating allele among the 
TABLE 1 Nucleotide and amino acid changes in El Tor and El Tor-like rtxA alleles 
a ET, El Tor; PP, pre-7th pandemic; environ., isolated from environment; USG, U.S. Gulf Coast. 
b For a complete list of genomes, see Table S1 in the supplemental material. 
c Green, unchanged; red, nonsynonymous change; yellow, synonymous change. 
d Blue, unchanged; red, amino acid change; X, change to stop codon. 
e As detailed in SNP analysis by Hendriksen et al. (10). 



seventh pandemic isolates is allele 1, including the representative 
strain N16961 and typical El Tor isolates from South America, 
Kenya, and Malaysia (Table 1; see also Table S2 in the supplemental 
material). The O139 serotype has allele 1, as do the atypical El 
Tor Matlab variant MJ-1236, atypical U.S. Gulf Coast isolates 
from 1991 and 2000, and altered El Tor strains isolated in Thailand 
and Malaysia. Expansion of the analysis to include studies 
that reported only SNPs rather than assembled genomes reveals 
that nearly all typical El Tor O1 and O139 and atypical El Tor 
strains from Matlab and Vietnam carry rtxA allele 1. Of note, this 
analysis also revealed that four atypical strains from Mozambique 
(2) have a single SNP at rtxA nt 3745, creating a conservative 
amino acid change from Gly to Ser, and thereby was denoted as 
allele 2. This mutation was confirmed in the published genome 
sequence of B-33 (25), which has an incomplete rtxA covering 
only 12,436 nt and was originally excluded from analysis. 

Multiple rtxA alleles arose within the altered El Tor strains. 
Altered El Tor strains are atypical El Tor strains with a classical 
ctxB1 allele in an otherwise El Tor CTXΦ strain (4). These strains 
have spread worldwide as the predominant agent of cholera since 
2000 (2, 5). As evidence of emergence from the typical El Tor 
strains, many isolates of altered El Tor have rtxA allele 1, including 
strains isolated in Thailand and Malaysia (see Table S1 in the 
supplemental material) and strains isolated in Bangladesh in 1994 and 
India in 2006 and 2007 (2). The altered El Tor strain with rtxA 
allele 1 was one of two strains circulating in Nepal in 2010 (group 
V) (10). 

From 2005 to 2009, an altered El Tor cholera outbreak occurred 
in East African countries, including Kenya, Djibouti, and 
Tanzania. Analysis of SNP data revealed that this outbreak strain is 



marked by 6 SNPs in the rtxA gene (2) (Table 1, allele 3). These 
SNPs occur in all 21 genomes in this group, verifying this as a 
unique allele despite the absence of a complete genome. The nu- 
cleotide changes are clustered at nt 2994/2997 and nt 3071/3072/ 
3079/3084 (nt 1553141 to 1553231; GenBank AE003852.1), sug- 
gesting they may have arisen by a single homologous recombi- 
nation event rather than spontaneous mutation. Consistent with 
the extreme conservation of this gene, only two of the mutations 
are nonsynonymous, and the protein product is highly conserved 
(Table 1). 
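
The clustering of the allele 3 SNPs can be made explicit by grouping positions that lie close together. The sketch below is a minimal illustration, not the method used in this study: `cluster_positions` is a hypothetical helper and the 10-nt gap threshold is an arbitrary assumption chosen for the example.

```python
def cluster_positions(positions, max_gap):
    """Group sorted positions so that neighbors within a cluster are <= max_gap apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # extend the current cluster
        else:
            clusters.append([pos])     # start a new cluster
    return clusters


GENE_LENGTH_NT = 13_638
ALLELE3_SNPS = [2994, 2997, 3071, 3072, 3079, 3084]  # rtxA positions from the text

clusters = cluster_positions(ALLELE3_SNPS, max_gap=10)  # threshold is an assumption
span = max(ALLELE3_SNPS) - min(ALLELE3_SNPS) + 1

print(clusters)  # [[2994, 2997], [3071, 3072, 3079, 3084]]
print(span)      # 91
print(round(100 * span / GENE_LENGTH_NT, 2))  # percent of the gene covered by the span
```

All six changes fall within a 91-nt window, less than 1% of the 13,638-nt gene, which is the pattern that motivates the suggestion of a single recombination event.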

Genomic studies revealed that a separate cluster of altered El 
Tor was isolated in Bangladesh as early as 1999 and subsequently 
transferred to India in 2004 and Haiti in 2010 (2, 10). This cluster 
is represented by sequenced strain CIRS 101 isolated in Bangladesh 
in 2002 (25). These strains do not have rtxA allele 1, similar to 
earlier circulating altered El Tor, nor do they have the allele 3 
unique to altered El Tor from the East Africa outbreak. Instead, 
this cluster has an rtxA gene that differs from N16961 by only one 
SNP, an exchange from G to A at nt 13602 (nt 1563760; GenBank 
AE003852.1). Using gene alignment, this full-length rtxA allele 4 
was identified in CIRS101 and in 36 additional sequenced ge- 
nomes, including 22 clinical isolates from the 2010 Haiti cholera 
epidemic. Two strains with the G13602A mutation isolated in 
South Africa and Zimbabwe have a second synonymous SNP at 
12639 (nt 1562797; GenBank AE003852.1), and this was denoted 
as allele 5 (Table 1). 

Notably, the rtxA allele 4 is found in 9 sequenced altered El Tor 
strains in combination with ctxB1 of the altered El Tor strain but 
also in 27 strains with a novel ctxB variant, ctxB7 (see Table S2 in 
the supplemental material). This analysis revealed that while 
FIG 2 Restoration of MARTX toxin activity to 2010EL-1786. HeLa cells were treated with log-phase 
V. cholerae for 90 min and then were photographed (A) and collected to prepare lysates for immuno- 
blotting using anti-actin and anti-tubulin antibodies (B). Strains used were control strain KFV43 
(N16961 ΔhapA) and Haiti epidemic strain 2010EL-1786 either alone (left) or after integration of 
pJD22 to correct stop codon and integrate 6XHis tag (right). Actin laddering in the Western blot is 
marked as actin monomer (Mo), dimer (Di), trimer (Tr), and tetramer (Te). 



CIRS101 and isolates from Pakistan, Russia, India, and Zimbabwe 
carry rtxA allele 4 in combination with ctxB1, all other sequenced 
rtxA allele 4 strains have ctxB7. These include strains first identi- 
fied in India in 2007 and Nepal in 2009 and subsequently in Ban- 
gladesh, Cameroon, Haiti, and the Dominican Republic in 2010 
(2, 7-11). Assessment of whole-genome data and SNP analysis 
revealed no strains in which the ctxB7 mutation occurred inde- 
pendently of rtxA allele 4. These data suggest that the ctxB7 allele 
arose within the rtxA allele 4 background, henceforth called rtxA4. 

Mutation G13602A generates a premature stop codon. 
Translation of the rtxA4 allele revealed that it would change a TGG 
codon for Trp to a TGA stop codon, resulting in loss of 12 aa from 
the C terminus of the protein (Table 1). Since all RTX proteins 
depend on secretion signals at the C terminus (21), it was consid- 
ered likely that this stop codon would be an inactivating mutation. 
We obtained 2010 Haiti epidemic isolate 2010EL-1786 and tested 
it for cell rounding and actin cross-linking. In contrast to the 
control strain KFV43, 2010EL-1786 did not cause cell rounding 
and actin cross-linking (Fig. 2). 
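
The effect of G13602A on the reading frame can be worked through numerically. The sketch below assumes that position 13602 is counted from the A of the ATG start codon of the 13,638-nt gene, which includes the stop codon; the helper `codon_of` is illustrative.

```python
GENE_LENGTH_NT = 13_638   # full-length rtxA, stop codon included (assumption)
SNP_POSITION = 13_602     # 1-based position of the G-to-A change
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_of(nt_position: int):
    """Return (codon number, position within codon), both 1-based."""
    index = nt_position - 1
    return index // 3 + 1, index % 3 + 1


codon_number, offset = codon_of(SNP_POSITION)

# Apply the G-to-A change at the affected position of the Trp codon.
wild_type_codon = "TGG"
mutant_codon = wild_type_codon[: offset - 1] + "A" + wild_type_codon[offset:]

full_length_aa = GENE_LENGTH_NT // 3 - 1
# Residues lost: the mutated codon itself and every codon after it.
residues_lost = full_length_aa - codon_number + 1

print(codon_number, offset)          # 4534 3
print(mutant_codon, mutant_codon in STOP_CODONS)  # TGA True
print(residues_lost)                 # 12
```

Under these assumptions the change falls at the third position of codon 4534, converts TGG to TGA, and removes the final 12 residues, in agreement with the loss of 12 aa from the C terminus described above.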

To ensure that the loss of MARTX toxin function is due solely 
to the single rtxA point mutation, we modified the strain by inte- 
grating plasmid pJD22 at the 3' end of the gene. This integrating 
plasmid has an insert corresponding to the last 616 nt of the 
N16961 rtxA gene, except with additional codons for a C-terminal 
6XHis tag when the plasmid integrates onto V. cholerae chromo- 
some I. We mated plasmid pJD22 into 2010EL-1786 and selected 
one merodiploid that acquired an uninterrupted ORF, with the 
2010EL-1786 TGA stop codon replaced with the N16961 TGG 
Trp codon. As a control, N16961-derived strain KFV43 was also 
modified to integrate the plasmid, demonstrating that addition of 
the 6XHis tag does not affect MARTX toxin activity. Gain of the 
N16961 codon sequence restored the ability of 2010EL-1786 to 
induce cell rounding and actin cross-linking (Fig. 2). The relative 
activity of the toxin expressed from 2010EL-1786ΩpJD22 was less 
than that of the KFV43ΩpJD22 control, indicating that the strain 



naturally produces less toxin. Yet, these 
data demonstrate that this SNP inacti- 
vated the function of the MARTX toxin. 

rtxA varies in environmental strains. 
Among the remaining 12 genomes with 
intact rtxA genes, the genes were found to 
be highly diverse (Fig. 3). In contrast to 
the O37 strain V52 that was closely related 
to El Tor strains, the O37 strain MZO-3 
was highly divergent from N16961, with 176 
SNPs, suggesting it evolved in a different 
lineage from V52, again consistent with 
the possibility that V52 arose from an El 
Tor O1 strain rather than being linked to the 
other O37 strains. 

The remaining 11 rtxA genes also differ 
greatly from N16961 by between 207 
and 302 nt or 1.5 to 2.2%. The rate of 
nonsynonymous changes was 4 to 7% 
(Fig. 3B). Close analysis of the SNP loca- 
tions suggests the rtxA gene has varied ex- 
tensively by homologous recombination. 
While many SNPs are independent in 
genomes, as would be expected if arising 
by DNA polymerase error, many changes 
occur in blocks, and these blocks are shared in multiple genomes. 
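
The percentages quoted for nucleotide divergence follow directly from the SNP counts and the 13,638-nt gene length. The sketch below shows the calculation together with a simple mismatch counter for two pre-aligned, equal-length sequences; both function names and the toy sequences are illustrative only.

```python
GENE_LENGTH_NT = 13_638


def count_snps(reference: str, query: str) -> int:
    """Count mismatched positions between two aligned sequences of equal length."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for r, q in zip(reference, query) if r != q)


def percent_divergence(n_snps: int, length: int = GENE_LENGTH_NT) -> float:
    """SNPs expressed as a percentage of the compared length, rounded to 0.1."""
    return round(100 * n_snps / length, 1)


print(count_snps("ATGGCATTA", "ATGACATTG"))  # 2 (toy example)
print(percent_divergence(207))  # 1.5
print(percent_divergence(302))  # 2.2
print(percent_divergence(176))  # 1.3, the MZO-3 comparison
```

A per-position counter of this kind treats every difference as independent, so it does not by itself distinguish polymerase errors from blocks of changes acquired together by recombination.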

A map of the SNPs showed a bias away from the 5' end of the 
gene and the effector regions and a strong bias for changes be- 
tween the end of the B repeat and the start of the ACD effector 
domain (Fig. 3A). The function of this highly variable region is 
unknown, except that it is predicted to be alpha helical in struc- 
ture. The extent of variation could indicate that this region is not 
essential for function and thus can tolerate change as long as the 
total length is retained. Alternatively, it is possible that changes in 
these regions reflect adaptation to a specific niche, such as coloni- 
zation of distinct hosts in the environment. 

Identification of two novel MARTX variants. An additional 
search of the NCBI WGS genomes using a TBLASTN algorithm 
and excluding all genomes previously analyzed revealed two se- 
quences in which there was strong alignment to both the N and the 
C terminus to the MARTX toxin but not the center of the protein, 
although the translated sequences originated from the same ge- 
nome contig. These sequences were analyzed as potential novel 
rtxA variants (Fig. 4). 

Strain RC385 is an O137 strain isolated from Chesapeake Bay, 
MD. It has previously been reported as having a MARTX toxin 
with a unique effector arrangement but containing a frameshift 
mutation (26). The updated rtxA sequence is now present as a 
single contig, and the frameshift is corrected to a single uninter- 
rupted ORF. The repeat regions varied from N16961 by 297 nt, 
consistent with a 3% rate of change similar to other environmental 
strains (Fig. 4B). Due to the extensive difference among strains, 
the point of exchange and precursor of RC385 rtxA could not be 
identified among the other V. cholerae environmental isolates by 
SNP analysis. 

As previously annotated (26), this MARTX toxin from RC385 
has three effectors. Effectors 1 and 3 are duplicated effectors sim- 
ilar to a domain of unknown function (DUF) shared with MARTX 
toxins from Photorhabdus spp. The second effector is predicted to 
be an adenylate cyclase found also in MARTX toxins of Vibrio 
FIG 3 SNP map of non-El Tor V. cholerae strains. (A) The domain organization of V. cholerae MARTX 
toxin is shown below a bar diagram representing total SNPs in DNA sequence of rtxA from 12 strains 
compared to reference strain N16961. (B) All SNPs across the 13,638-bp gene are marked by a vertical 
bar. Strains are in order of increasing nucleotide diversity compared to N16961, as indicated at right. 



caribbenthicus (NZ_AEIU01000000), Vibrio nigripulchritudo (27), 
and Vibrio anguillarum 775 (28). The putative adenylate cyclase is 
most closely related to the type III secretion effector ExoY, an 
adenylate cyclase from Pseudomonas aeruginosa (29). 

Strain HE-09 is an environmental isolate from Haiti. Annota- 
tion of the deduced amino acid sequence revealed extensive vari- 
ation in the regions aligned with N16961, with 785 SNPs across 
8,954 aligned residues (8.7% difference). The encoded MARTX 
toxin also carried a unique set of effector domains. HE-09 includes 
a DUF shared with Vibrio vulnificus and Xenorhabdus species 
MARTX toxins (20). The next effector is a Rho-inactivation do- 
main (RID). The third effector domain of the HE-09 is a previ- 
ously unknown MARTX effector. The predicted protein has a 
4HBM-type membrane localization domain that predicts that the 



effector would be targeted to anionic lipid 
membranes (30, 31). The tethered effec- 
tor showed strong homology with the Ba- 
cillus cereus VIP2 toxin family. VIP2 and 
related toxins are ADP-ribosylating tox- 
ins that modify G-actin (32). Thus, 
HE-09 is an environmental Vibrio strain 
quite distinct from the four other envi- 
ronmental V. cholerae strains also isolated 
during the Haiti epidemic and from the 
altered El Tor clinical strains, all of which 
share an effector arrangement with the 
reference strain N16961. 



DISCUSSION 

In this study, we sought to understand 
how diversity in the gene encoding the 
V. cholerae MARTX toxin might affect the 
function of the toxin both in clinical iso- 
lates and in the environment. Recent 
studies have shown that the composite 
MARTX toxin genes can be altered in 
V. vulnificus by homologous recombina- 
tion to acquire new effector domains 
(24, 33), and we were interested if similar 
variation occurred in V. cholerae. For 
genomic analysis, it was first necessary to 
reconcile discrepancies in the annotation 
of the rtxA gene of the representative 
strain N16961. We showed experimentally 
that the correct start site is the ATG 
downstream of rtxC, although some ac- 
tivity may arise from the upstream GTG. 
Other Vibrio spp. have been identified 
to have rtxA genes for MARTX toxins (20). Analysis of the 
upstream sequences for rtxA genes from V. vulnificus (34-36), 
V. anguillarum (28), V. splendidus (37), and V. caribbenthicus 
(NZ_AEIU01000000) reveals that these each contain a conserved 
trinucleotide GTG and ATG start site as well as a conserved TAA 
stop codon for rtxC. However, the intervening sequences vary in 
comparison to N16961. Most notably, in no cases would use of the 
GTG as a start site introduce the appropriate frame of translation 
for the MARTX toxin, indicating that in all Vibrio spp., the start 
site will be conserved with the ATG of V. cholerae, as mapped in 
this study. 

Using the correct 13,638-bp gene, we examined diversity of this 
gene in 71 V. cholerae strains for which the entire gene was represented 
in a sequenced genome supplemented by available SNP 
data from recent whole-genome phylogeny (2, 8, 10). Whole-genome 
studies have revealed that the El Tor O1 lineage differs 
from N16961, with a calculated rate of only 3.3 SNPs/year (2). Not 
surprisingly, given the global conservation of these strains, few 
SNPs were identified in the rtxA gene despite the 100-year history 
of available data. In total, only 9 rtxA alleles were identified across 
all El Tor O1, O139, atypical, and altered El Tor strains. 

All of the El Tor seventh pandemic strains shared an adenosine 
nucleoside at rtxA nt 8731, distinct from a guanosine present in 
pre-seventh pandemic strains, other El Tor-like lineage strains 
from the United States Gulf Coast and Australia, and the O37 
strain V52. The nucleoside change results in a codon change from 
glycine in prepandemic strains to serine in pandemic strains 
within the RID effector domain. Notably, among other MARTX 
toxins with an RID, a Gly at this position is most prevalent (38). 
This finding of a conservative mutation unique to the El Tor seventh 
pandemic MARTX RID is consistent with the El Tor strains 
representing a clonal expansion from a single predecessor at some 
point in the 1950s, diverging away from other El Tor-like lineages 
(2). Altogether, the strong conservation of this gene, with only 1 
to 6 SNPs across 100 years and worldwide geography, reinforces 
that these strains are closely related evolutionarily and distinct 
from other V. cholerae isolates. 

The conservation of the rtxA sequence in the El Tor O1 and 
closely related strains is in stark contrast to the extensive variation 
in the non-El Tor group, whether the strain was of environmental 
or clinical origin. The database for these strains is more limited, 
although each of the 12 isolates had a unique rtxA gene that dif- 
fered from the reference strain by between 176 and 321 nt, or 1 to 
2%. This variation matches the calculated 2% rate of change for 
whole-genome variation of 3 environmental isolates recently re- 
ported (8). This dramatic variation was not expected, because in a 
recent multi-virulence locus sequence typing analysis, including 
both O1 and non-O1 strains, non-O139 isolates showed limited 
variation in the rtxA gene (39). Interestingly, this analysis was 
limited to 332 bp from nt 1339 to 1671, and assessment of the SNP 
map for our analysis showed limited SNPs in this region (Fig. 4A). 
Conservation in this region is in contrast to the spacer region 
between the B repeats and the ACD effector domain where exten- 
sive variation occurred, indicating that conservation of rtxA, as 
previously reported (39), was biased by the region analyzed and 
does not reflect total diversity across the large gene. Many of the 
SNPs in the genomes seemed to arise in clusters across several 
genomes, suggesting that variation in this gene arises extensively 
from homologous recombination and the regions amenable to 
homologous recombination may be defined in some way by DNA 
structure or nucleotide sequence. In all, in contrast to the lack of 
variation within the El Tor group, there is a large capacity for the 
rtxA gene to vary in the environmental strains. 
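
The percent divergence quoted above follows directly from the difference counts and the gene length. The short Python sketch below reproduces that arithmetic. It assumes that differences are expressed relative to the full 13,638-bp reference gene, and the function name is illustrative only.

```python
GENE_LENGTH_NT = 13638  # length of the correctly annotated rtxA gene


def percent_divergence(n_differences, length=GENE_LENGTH_NT):
    """Nucleotide differences as a percentage of the reference gene length."""
    return 100.0 * n_differences / length


# Lowest and highest difference counts reported for the 12 non-El Tor alleles.
for n in (176, 321):
    print(n, "nt differences:", round(percent_divergence(n), 2), "%")
```

By the same calculation, the 1 to 6 SNPs separating the El Tor alleles amount to less than 0.05% of the gene.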

Two genes were identified that are new toxin variants due to a 
change in the central portion of rtxA that encodes the effectors. 
These changes converted one of the toxins to a potential adenylate 
cyclase toxin and another to a putative actin ADP-ribosylating 
toxin. Due to extensive nucleotide diversity in the conserved re- 
gions, the site of recombination events could not be identified. 
The consequences of these genetic changes will need to be further 
explored in the future, as will whether these changes affected the 
biological niche these strains occupy. 

The most clinically relevant change in the rtxA toxin gene was 
one SNP that emerged within the currently circulating strains 
known as the altered El Tor strain. This SNP inactivated the func- 
tion of the MARTX toxin by introducing a premature stop codon 
that would result in a protein truncated by 12 amino acids, prob- 
ably disrupting the C-terminal secretion signal. Strains 2011EL- 
1098, 2009EV-1131, and 2011EL-1132 were excluded from analysis 
because of out-of-frame mutations, but if these mutations could be vali- 
dated, they would indicate that this group of strains is accumulat- 
ing additional null mutations in rtxA, generating a pseudogene 
that could not be reversed by a single nucleotide reversion. 
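
The position of the premature stop codon can be inferred from the figures given in this article. The Python sketch below is a rough illustration, assuming that the 13,638-bp gene length includes the native stop codon, that "truncated by 12 amino acids" means 12 fewer residues than the 4,545-residue full-length protein, and that the affected Trp codon is TGG (the only Trp codon in the standard code). Which single-nucleotide change actually occurred is not stated here, so the sketch simply enumerates the possibilities.

```python
GENE_LENGTH_NT = 13638  # rtxA length, assumed to include the native stop codon
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"

protein_length = GENE_LENGTH_NT // 3 - 1      # residues in the full-length toxin
truncated_length = protein_length - 12        # residues left after truncation
stop_codon_number = truncated_length + 1      # codon converted to a stop
codon_span = ((stop_codon_number - 1) * 3 + 1, stop_codon_number * 3)


def single_nt_stops(codon):
    """Return the stop codons reachable from `codon` by one substitution."""
    found = set()
    for i, original in enumerate(codon):
        for base in BASES:
            if base != original:
                mutant = codon[:i] + base + codon[i + 1:]
                if mutant in STOP_CODONS:
                    found.add(mutant)
    return found


print(protein_length, truncated_length, stop_codon_number, codon_span)
print(sorted(single_nt_stops("TGG")))
```

Under these assumptions, the stop replaces codon 4534 (nt 13600 to 13602), and a single G-to-A change at either the second or third codon position would be sufficient to create it.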

Subsequent to the emergence of rtxA allele 4, another signifi- 
cant mutation apparently arose within the rtxA-null background 
in which a point mutation in ctxB created a CtxB with a never- 
before-observed Asn at aa 20. As no strains were identified in 
which this ctxB7 allele occurred with an intact rtxA gene, it is 
suggested that ctxB7 arose within the rtxA-null background. This 
genetic cluster is represented by 27 different sequenced strains and 
extensive additional WGS analysis, demonstrating that strains 
with both mutations have been transmitted globally, reaching In- 
dia, Nepal, Cameroon, Haiti, and the Dominican Republic (2, 
8-11). The data also indicate that in addition to comprising a 
clonal cluster that may have originated in India (7), these strains 
have a distinct virulence factor profile from other El Tor strains (6, 
12). Therefore, it is necessary when categorizing the altered El Tor 
to distinguish not only the ctxB allele but also the rtxA allele to 
determine if the strain produces an active MARTX toxin. It is yet 
to be determined which of these mutations, if either, can account 
for the increased virulence of these strains in humans (6), al- 
though 2010EL-1786 is not statistically different in virulence from 
typical El Tor strains in infant mice (12) or in adult mice (our 
unpublished findings). 

Interestingly, the classical strains responsible for historical 
cholera pandemics are also noted for a naturally occurring dele- 
tion that removes > 7 kb of the rtx locus, deactivating the MARTX 
toxin (19). The question then is, as strains become more virulent 
and adapted for human-to-human passage, why do they deacti- 
vate MARTX? The toxin is very large and may be detrimental to 
growth due to energy expenditure reducing rapid growth neces- 
sary for increased dissemination. If the toxin has a role in the 
environment, it may no longer be necessary if the strain acquires 
the ability to transfer host to host, making circulation through the 
environment less essential. The toxin has been shown in mice to 
have a function in virulence by promoting colonization of the 
intestine through compromising the innate immune system dur- 
ing early colonization (40, 41). However, it has been shown that 
the toxin is fully redundant in function with hemolysin, a pore- 
forming toxin (42). Strain 2010EL-1786, the representative strain 
used in this study, has been reported to be hemolytic, similar to 
other El Tor strains (12), and we independently confirmed this 
result (data not shown). Thus, the loss of function of the rtxA gene 
may be circumvented by the action of secreted hemolysin. 

Yet, classical strains are nonhemolytic in addition to not pro- 
ducing a MARTX toxin. It has been postulated that innate im- 
mune evasion is a key function of CT, in addition to its ability to 
induce enterocyte secretion (43). It is possible that in the context 
of excess production of classical-type CT, as occurs in classical 
strains and the altered El Tor strains now circulating, neither 
MARTX nor hemolysin is required for innate immune evasion, 
their function being replaced by the immunomodulatory function 
of the classical form of CtxB. 

Overall, we have found that the MARTX toxin in general has 
been highly conserved in V. cholerae, but when variants do arise, 
they can dramatically alter protein function and possibly the overall 
process of pathogenesis. Thus, while it is clear that analysis of 
whole genomes can provide phylogenetic evidence of how strains 
are evolving over time, detailed analysis of individual genes is still 
necessary to reveal how small changes in nucleotide sequence can 
dramatically affect protein function, and these mutations may 
contribute to increased fitness and global spread. 

MATERIALS AND METHODS 

Genome sequences and analysis. All genome sequences used for this 
analysis were publicly available at NCBI (http://www.ncbi.nlm.nih.gov/). 
The database included 95 complete and incomplete (i.e., WGS) 
genomes as of 3 October 2012. A slightly expanded database was reexam- 
ined on 10 March 2013. Genomes selected for analysis are listed in 
Table S1 in the supplemental material with NCBI accession numbers. SNP 
data from published supplemental data were used to augment analysis of 
the El Tor pandemic strains. All genome database searches were con- 
ducted using BLAST interfaces within the NCBI website. Multiple align- 
ments of identified sequences were performed by the ClustalW algorithm 
using the MacVector 12.6.0 software package. SNPs were diagrammed by 
importing sequences into Excel to compare individual SNPs. These were 
then graphed using Prism for Macintosh 4.0. 
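
The SNP comparison described above was carried out in MacVector, Excel, and Prism. As an illustrative alternative only (not the procedure used in this study), the Python sketch below lists SNPs between two sequences that have already been aligned, reporting positions in reference coordinates. The function name and the toy sequences are hypothetical.

```python
def find_snps(reference, query):
    """List single-nucleotide differences between two aligned sequences.

    Both inputs must be rows of the same alignment (equal length, with '-'
    marking gaps). Returns (reference_position, reference_base, query_base)
    tuples, with positions 1-based and counted on the ungapped reference.
    Columns containing a gap in either sequence are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    ref_pos = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base != "-":
            ref_pos += 1
        if "-" in (ref_base, query_base):
            continue
        if ref_base != query_base:
            snps.append((ref_pos, ref_base, query_base))
    return snps


# Toy aligned sequences for demonstration; these are not rtxA data.
toy_reference = "ATGGGCAAATTT"
toy_query = "ATGAGCAAGTTT"

for position, ref_base, query_base in find_snps(toy_reference, toy_query):
    print(position, ref_base, ">", query_base)
```

Applying such a function to each allele against the N16961 reference would give the per-position differences needed for an SNP map of the kind shown in the figures.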

Strains and reagents. V. cholerae strain KFV119 (N16961 ΔhlyA 
ΔhapA) and KFV43 (N16961 ΔhapA) are from our collection (44, 45), 
and 2010EL-1786 (9) was obtained from the American Type Culture Col- 
lection (ATCC BAA2163). All strains were cultured on Luria-Bertani agar 
or broth supplemented with 100 µg/ml streptomycin, 50 µg/ml ampicil- 
lin, or 5% sucrose as necessary. Cells were cultured in Dulbecco's minimal 
Eagle medium (DMEM; Life Technologies) containing 10% fetal bovine 
serum (FBS; Gemini Bioproducts or Atlanta Biologicals) and in the pres- 
ence of penicillin and streptomycin antibiotics. All custom primers for 
DNA amplification and site-directed mutagenesis were obtained from 
Genosys (The Woodlands, TX) or Integrated DNA Technologies (Cor- 
alville, IA) and are listed in Table S2 in the supplemental material. En- 
zymes were obtained from New England Biolabs and chemicals and media 
from Sigma or Fisher. Sequencing was conducted at the Northwestern 
University Genomics Core Facility. 

Generation of modified rtxA promoter sequences in N16961. To 
generate site-directed mutants in the rtxA promoter, DNA encompassing 
gene sequences from VC1449, rtxC, and the 5' end of rtxA was amplified. 
During amplification, the sequence was modified using crossover PCR to 
insert 6 nt for an EcoRV site immediately adjacent to the ATG putative 
start site for rtxA using techniques as previously described (44). The prod- 
uct was cleaved within rtxC by BspMI to reduce the fragment size, con- 
verted to blunt ends using Klenow, and cloned into pCR2.1 (Invitrogen). 
This clone was digested with restriction enzymes SpeI and XhoI and 
moved into the similarly digested sacB counterselectable plasmid pWM91 
(46) to generate plasmid pJD12. The modified rtxA sequence was recom- 
bined onto the V. cholerae KFV119 chromosome I by double homologous 
recombination with sucrose counterselection as previously described 
(44), generating V. cholerae strain JD8. Gain of the EcoRV site was con- 
firmed by PCR amplification across the region and subsequent processing 
of the PCR product with EcoRV. A second pWM91-based plasmid, 
pJD11, encompassing the same region without the EcoRV site was also 
generated and then modified as detailed in Results using the QuikChange 
II site-directed mutagenesis kit (Agilent Technologies, Santa Clara, CA). 
These single-nucleotide changes (mut1 to mut4) were exchanged into 
strain JD8 as described above, except that strains that gained the point 
mutations were identified by loss of the EcoRV cleavage and by sequenc- 
ing of the PCR products. 

Restoration of the stop codon in 2010EL-1786 using plasmid inte- 
gration. Plasmid pKJF344 is a derivative of integrating plasmid pGP704 
(47) that has a segment of N16961 DNA corresponding to the 3' end of the 
rtxA gene, except the sequence was modified during amplification to in- 
clude 18 nt for a 6×His C-terminal tag immediately before the TAA stop 
codon. It has been previously shown that addition of six histidines to the 
MARTX toxin does not affect function (48). The plasmid was transferred 
to V. cholerae 2010EL-1786 or KFV119 by mating from Escherichia coli 
SM10λpir, selecting for single homologous integration of the plasmid by 
gain of ampicillin resistance. Isolated colonies were screened by amplify- 
ing the integrated insert and sequencing to ensure that the recombination 
event resulted in gain of the N16961 Trp codon replacing the stop codon 
in 2010EL-1786. 

Assessment of cell rounding and actin cross-linking. A total of 10^5 
HeLa cells were seeded into 6-well culture dishes. Before assay, medium 
was exchanged for DMEM without antibiotics or FBS. Bacteria grown to 
logarithmic phase were washed in phosphate-buffered saline and added to 
wells at a multiplicity of infection of 20. Plates were incubated under 5% 
CO2 at 37°C for 90 min. Cells were viewed using a Nikon T200 phase 
microscope and photographed using a Nikon E995 digital camera. Cells 
were collected by scraping, pelleted at 500 × g, and lysed by boiling in 
SDS-PAGE buffer. Cross-linking of actin was assessed by Western blot- 
ting using anti-actin and anti-tubulin monoclonal antibodies (Sigma) as 
previously described (38). 

SUPPLEMENTAL MATERIAL 

Supplemental material for this article may be found at 
http://mbio.asm.org/lookup/suppl/doi:10.1128/mBio.00624-12/-/DCSupplemental. 

Table S1, DOCX file, 0.1 MB. 

Table S2, DOCX file, 0.1 MB. 